Neural network language models (NNLMs), including the feed-forward NNLM (FNNLM) and the recurrent NNLM (RNNLM), have proved to be powerful for sequence modeling. One main concern for NNLMs is the heavy computational burden of the output layer, where the output must be probabilistically normalized and the normalizing factors require substantial computation. Fast rescoring of N-best lists or lattices with an NNLM has therefore attracted much attention in large-scale applications. In this paper, the statistical characteristics of the normalizing factors are investigated on the N-best list. Based on these observations, we propose to approximate the normalizing factor for each hypothesis as a constant proportional to the number of words in the hypothesis. The unnormalized NNLM is then investigated and combined with a back-off N-gram for fast rescoring; it can be computed very quickly without normalization in the output layer, reducing the complexity significantly. We apply the proposed method to a well-tuned context-dependent deep neural network hidden Markov model (CD-DNN-HMM) speech recognition system on the English Switchboard phone-call speech-to-text task, where both an FNNLM and an RNNLM are trained to demonstrate the method. Experimental results show that the unnormalized NNLM probability is quite complementary to that of the back-off N-gram, and that combining the unnormalized NNLM with the back-off N-gram further reduces the word error rate at little additional computational cost.
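As a minimal sketch of the approximation described above (the notation s(w_i, h_i) for the unnormalized output-layer score and c for the per-word constant is ours, not the paper's), the exact hypothesis score and its constant-per-word approximation can be written as follows:

```latex
% Exact NNLM log probability of a hypothesis W = w_1 ... w_N:
% each log-normalizer log Z(h_i) requires a sum over the full vocabulary V.
\log P(W) = \sum_{i=1}^{N} \Big[ s(w_i, h_i) - \log Z(h_i) \Big],
\qquad Z(h_i) = \sum_{w' \in V} \exp\!\big( s(w', h_i) \big)

% Approximation: the total normalizing factor of a hypothesis is treated as
% a constant c per word,
\sum_{i=1}^{N} \log Z(h_i) \approx N \cdot c
\quad\Longrightarrow\quad
\log P(W) \approx \sum_{i=1}^{N} s(w_i, h_i) \;-\; N \cdot c
```

Under this sketch, rescoring no longer needs the vocabulary-wide sum in the softmax, and the per-word constant behaves like a word insertion penalty when the unnormalized NNLM score is interpolated with the back-off N-gram score.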